    The Internet AS-Level Topology: Three Data Sources and One Definitive Metric

    We calculate an extensive set of characteristics for Internet AS topologies extracted from the three data sources most frequently used by the research community: traceroutes, BGP, and WHOIS. We discover that traceroute and BGP topologies are similar to one another but differ substantially from the WHOIS topology. Among the widely considered metrics, we find that the joint degree distribution appears to fundamentally characterize Internet AS topologies, as well as to narrowly define values for other important metrics. We discuss the interplay between the specifics of the three data collection mechanisms and the resulting topology views. In particular, we show how the data collection peculiarities explain differences in the resulting joint degree distributions of the respective topologies. Finally, we release to the community the input topology datasets, along with the scripts and output of our calculations. This supplement should enable researchers to validate their models against real data and to make a more informed selection of topology data sources for their specific needs.
    Comment: This paper is a revised journal version of cs.NI/050803
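    As a companion to the abstract, here is a minimal sketch of how the joint degree distribution could be computed over an AS-level edge list with networkx; the input filename is a placeholder, not one of the paper's released datasets.

        # Minimal sketch: joint degree distribution (JDD) of an undirected AS graph.
        # Assumes an edge list "as_graph.txt" with one "AS1 AS2" pair per line.
        from collections import Counter

        import networkx as nx

        G = nx.read_edgelist("as_graph.txt")  # hypothetical input file

        # JDD: for every edge, count the (unordered) pair of endpoint degrees.
        jdd = Counter()
        for u, v in G.edges():
            k1, k2 = sorted((G.degree(u), G.degree(v)))
            jdd[(k1, k2)] += 1

        # Normalize to a probability P(k1, k2) over edges.
        m = G.number_of_edges()
        for (k1, k2), count in sorted(jdd.items()):
            print(f"P({k1},{k2}) = {count / m:.6f}")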

    On Compact Routing for the Internet

    While there exist compact routing schemes designed for grids, trees, and Internet-like topologies that offer routing tables whose sizes scale logarithmically with the network size, we demonstrate in this paper that, in view of recent results in compact routing research, such logarithmic scaling on Internet-like topologies is fundamentally impossible in the presence of topology dynamics or topology-independent (flat) addressing. We use analytic arguments to show that the number of routing control messages per topology change cannot scale better than linearly on Internet-like topologies. We also employ simulations to confirm that logarithmic routing table size scaling is broken by topology-independent addressing, a cornerstone of popular locator-identifier split proposals aiming at improving routing scaling in the presence of network topology dynamics or host mobility. These pessimistic findings lead us to the conclusion that a fundamental re-examination of the assumptions behind routing models and abstractions is needed in order to find a routing architecture that would be able to scale "indefinitely."
    Comment: This is a significantly revised, journal version of cs/050802
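    To make the scaling contrast concrete, here is a toy sketch (not the paper's analytic model) comparing per-node routing state under topology-independent flat addressing against interval routing on a spanning tree of an Internet-like random graph:

        # Illustrative sketch: routing state with flat addressing versus tree
        # interval routing on an Internet-like (Barabasi-Albert) graph.
        import networkx as nx

        n = 1000
        G = nx.barabasi_albert_graph(n, 2)

        # Flat (topology-independent) addressing: nothing aggregates, so a
        # shortest-path table needs one entry per destination at every node.
        flat_entries = n - 1

        # Interval routing on a spanning tree: a DFS numbering lets each node
        # keep one interval per incident tree edge, i.e. O(degree) entries.
        T = nx.bfs_tree(G, 0).to_undirected()
        tree_entries = sum(dict(T.degree()).values()) / n  # avg entries/node

        print(f"flat addressing: {flat_entries} entries per node")
        print(f"tree interval routing: {tree_entries:.2f} entries per node on average")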

    GT: Picking up the Truth from the Ground for Internet Traffic

    Much of Internet traffic modeling, firewall, and intrusion detection research requires traces where some ground truth regarding application and protocol is associated with each packet or flow. This paper presents the design, development, and experimental evaluation of gt, an open source software toolset for associating ground truth information with Internet traffic traces. By probing the monitored host's kernel to obtain information on active Internet sessions, gt gathers ground truth at the application level. Preliminary experimental results show that gt's effectiveness comes at little cost in terms of overhead on the hosting machines. Furthermore, when coupled with other packet inspection mechanisms, gt can derive ground truth not only in terms of applications (e.g., e-mail), but also in terms of protocols (e.g., SMTP vs. POP3).
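    A minimal sketch of the core idea, not the gt toolset itself: polling the kernel's socket tables with psutil to tag live flows with the owning application's name.

        # Sketch: map each established socket to the process that owns it,
        # the same kind of kernel-derived ground truth gt collects.
        import time

        import psutil

        def snapshot_flows():
            """Map (proto, local ip/port, remote ip/port) to process names."""
            flows = {}
            for c in psutil.net_connections(kind="inet"):
                if c.raddr and c.pid:  # established sockets with a known owner
                    try:
                        app = psutil.Process(c.pid).name()
                    except psutil.NoSuchProcess:
                        continue
                    key = (c.type, c.laddr.ip, c.laddr.port, c.raddr.ip, c.raddr.port)
                    flows[key] = app
            return flows

        for _ in range(3):  # periodic polling, as gt does on the monitored host
            for flow, app in snapshot_flows().items():
                print(app, flow)
            time.sleep(1)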

    Mapping peering interconnections to a facility

    Annotating Internet interconnections with robust physical coordinates at the level of a building facilitates network management, including interdomain troubleshooting, and also has practical value for helping to locate points of attacks, congestion, or instability on the Internet. But, like most other aspects of Internet interconnection, the geophysical locus of a given link is generally not public; the facility used must be inferred to construct a macroscopic map of peering. We develop a methodology, called constrained facility search, to infer the physical interconnection facility where an interconnection occurs among all possible candidates. We rely on publicly available data about the presence of networks at different facilities, and execute traceroute measurements from more than 8,500 available measurement servers scattered around the world to identify the technical approach used to establish an interconnection. A key insight of our method is that inference of the technical approach for an interconnection sufficiently constrains the number of candidate facilities that it is often possible to identify the specific facility where a given interconnection occurs. Validation via private communication with operators confirms the accuracy of our method, which outperforms heuristics based on naming schemes and IP geolocation. Our study also reveals the multiple roles that routers play at interconnection facilities; in many cases the same router implements both private interconnections and public peerings, in some cases via multiple Internet exchange points. Finally, our study sheds light on the peering engineering strategies used by different types of networks around the globe.
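    A toy sketch of the constrained-search idea under invented data: intersect the two networks' published facility lists (PeeringDB-style), then prune with a constraint implied by the inferred interconnection type. All names and AS numbers below are hypothetical.

        # Toy sketch of "constrained facility search" on hypothetical data.
        facilities = {            # AS number -> set of facilities it is present at
            64500: {"Equinix-AM7", "Interxion-AMS9", "Telehouse-LON1"},
            64501: {"Equinix-AM7", "Telehouse-LON1"},
        }
        ixp_members = {"Equinix-AM7"}  # facilities hosting the IXP seen in traceroute

        def candidate_facilities(as_a, as_b, over_ixp):
            """Candidates = common facilities, constrained by interconnection type."""
            common = facilities[as_a] & facilities[as_b]
            # If traceroute shows a public peering over an IXP, the link must sit
            # at a facility where that IXP is deployed; this often leaves one
            # candidate, which is the key insight the abstract describes.
            return common & ixp_members if over_ixp else common

        print(candidate_facilities(64500, 64501, over_ixp=True))  # {'Equinix-AM7'}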

    Weighted network modules

    The inclusion of link weights in the analysis of network properties allows a deeper insight into the (often overlapping) modular structure of real-world webs. We introduce a clustering algorithm (CPMw, Clique Percolation Method with weights) for weighted networks, based on the concept of percolating k-cliques with high enough intensity. The algorithm allows overlaps between the modules. First, we give detailed analytical and numerical results about the critical point of weighted k-clique percolation on (weighted) Erdős–Rényi graphs. Then, for a scientist collaboration web and a stock correlation graph, we compute three-link weight correlations and, with the CPMw, the weighted modules. After reshuffling link weights in both networks and computing the same quantities for the randomised control graphs as well, we show that groups of three or more strong links prefer to cluster together in both original graphs.
    Comment: 19 pages, 7 figures
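    A minimal sketch of weighted clique percolation for k = 3, in the spirit of CPMw; the intensity definition (geometric mean of link weights) and the threshold are the only ingredients assumed here.

        # Sketch: weighted 3-clique percolation. Keep triangles whose intensity
        # exceeds a threshold, join triangles sharing an edge, read off modules.
        from itertools import combinations
        from math import prod

        import networkx as nx

        def cpmw_k3(G, threshold):
            # All triangles above the intensity threshold.
            cliques = []
            for nodes in nx.enumerate_all_cliques(G):
                if len(nodes) == 3:
                    w = [G[u][v]["weight"] for u, v in combinations(nodes, 2)]
                    if prod(w) ** (1 / 3) >= threshold:
                        cliques.append(frozenset(nodes))
            # Two 3-cliques percolate into one module if they share k-1 = 2 nodes.
            CG = nx.Graph()
            CG.add_nodes_from(cliques)
            for a, b in combinations(cliques, 2):
                if len(a & b) == 2:
                    CG.add_edge(a, b)
            # A module is the union of the cliques in one connected component.
            return [set().union(*comp) for comp in nx.connected_components(CG)]

        # Tiny example: two strong triangles sharing an edge percolate together.
        H = nx.Graph()
        H.add_weighted_edges_from(
            [(1, 2, 5.0), (2, 3, 5.0), (1, 3, 5.0), (2, 4, 4.0), (3, 4, 4.0)]
        )
        print(cpmw_k3(H, threshold=1.0))  # -> [{1, 2, 3, 4}]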

    Inferring persistent interdomain congestion

    There is significant interest in the technical and policy communities regarding the extent, scope, and consumer harm of persistent interdomain congestion. We provide empirical grounding for discussions of interdomain congestion by developing a system and method to measure congestion on thousands of interdomain links without direct access to them. We implement a system based on the Time Series Latency Probes (TSLP) technique that identifies links with evidence of recurring congestion suggestive of an under-provisioned link. We deploy our system at 86 vantage points worldwide and show that congestion inferred using our lightweight TSLP method correlates with other metrics of interconnection performance impairment. We use our method to study interdomain links of eight large U.S. broadband access providers from March 2016 to December 2017, and validate our inferences against ground-truth traffic statistics from two of the providers. For the period over which we gathered measurements, we did not find evidence of widespread endemic congestion on interdomain links between access ISPs and directly connected transit and content providers, although some such links exhibited recurring congestion patterns. We describe limitations, open challenges, and a path toward the use of this method for large-scale third-party monitoring of the Internet interconnection ecosystem.
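    A simplified sketch of the kind of decision rule TSLP data supports; the threshold and the synthetic series below are assumptions for illustration, not the paper's parameters.

        # Sketch: given per-hour minimum RTTs to the near and far ends of an
        # interdomain link, flag hours where the far side rises well above its
        # baseline while the near side stays flat -- a signature of queueing
        # at that link rather than earlier on the path.
        import numpy as np

        def congested_hours(near_rtt, far_rtt, rise_ms=5.0):
            """near_rtt, far_rtt: arrays of per-hour minimum RTTs (ms)."""
            near_base = np.percentile(near_rtt, 10)      # uncongested baseline
            far_base = np.percentile(far_rtt, 10)
            far_up = far_rtt - far_base > rise_ms        # far side elevated
            near_flat = near_rtt - near_base < rise_ms   # path to link is clean
            return np.flatnonzero(far_up & near_flat)

        # Hypothetical day: far side shows an evening-peak RTT increase.
        hours = np.arange(24)
        near = 10 + 0.2 * np.random.rand(24)
        far = 12 + 0.2 * np.random.rand(24) + np.where((hours >= 19) & (hours <= 22), 8, 0)
        print(congested_hours(near, far))  # -> [19 20 21 22]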

    Snazer: the simulations and networks analyzer

    Background: Networks are widely recognized as key determinants of structure and function in systems that span the biological, physical, and social sciences. They are static pictures of the interactions among the components of complex systems. Often, much effort is required to identify networks as part of particular patterns, as well as to visualize and interpret them. From a purely dynamical perspective, simulation represents a relevant way out. Many simulator tools capitalize on the "noisy" behavior of some systems and use formal models to represent cellular activities as temporal trajectories. Statistical methods have been applied to fairly large numbers of replicated trajectories in order to infer knowledge. A tool that both graphically manipulates reactive models and deals with sets of simulation time-course data by aggregation, interpretation, and statistical analysis is missing and could add value to simulators. Results: We designed and implemented Snazer, the simulations and networks analyzer. Its goal is to aid the processes of visualizing and manipulating reactive models, as well as to share and interpret time-course data produced by stochastic simulators or by any other means. Conclusions: Snazer is a solid prototype that integrates biological network and simulation time-course data analysis techniques.
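    A small sketch of the trajectory-aggregation step such a tool performs, on synthetic replicated runs; the decay curve and noise model are invented for illustration.

        # Sketch: summarize replicated stochastic simulation runs as a mean
        # curve with a confidence band at each sampled time point.
        import numpy as np

        rng = np.random.default_rng(0)
        t = np.linspace(0, 10, 101)
        # 50 replicated noisy trajectories of one species' copy number.
        runs = 100 * np.exp(-0.3 * t) + rng.normal(0, 5, size=(50, t.size))

        mean = runs.mean(axis=0)
        sem = runs.std(axis=0, ddof=1) / np.sqrt(runs.shape[0])
        lo, hi = mean - 1.96 * sem, mean + 1.96 * sem  # ~95% confidence band

        for i in range(0, t.size, 20):
            print(f"t={t[i]:5.1f}  mean={mean[i]:7.2f}  95% CI=({lo[i]:.2f}, {hi[i]:.2f})")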

    Ten Things Lawyers Should Know about Internet Research


    Remote physical device fingerprinting

    We introduce the area of remote physical device fingerprinting, or fingerprinting a physical device, as opposed to an operating system or class of devices, remotely and without the fingerprinted device's known cooperation. We accomplish this goal by exploiting small, microscopic deviations in device hardware: clock skews. Our techniques do not require any modification to the fingerprinted devices. Our techniques report consistent measurements when the measurer is thousands of miles, multiple hops, and tens of milliseconds away from the fingerprinted device, and when the fingerprinted device is connected to the Internet from different locations and via different access technologies. Further, one can apply our passive and semi-passive techniques when the fingerprinted device is behind a NAT or firewall, and also when the device's system time is maintained via NTP or SNTP. One can use our techniques to obtain information about whether two devices on the Internet, possibly shifted in time or IP addresses, are actually the same physical device. Example applications include: computer forensics; tracking, with some probability, a physical device as it connects to the Internet from different public access points; and counting the number of devices behind a NAT even when the devices use constant or random IP IDs.
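    A minimal sketch of the skew estimate on synthetic TCP timestamps; the paper derives the skew with a linear-programming lower bound, and the least-squares fit here is a stand-in for illustration.

        # Sketch: estimate a remote device's clock skew by regressing the drift
        # of its TCP timestamp clock against the measurer's own clock.
        import numpy as np

        hz = 100.0                      # device timestamp clock frequency (assumed)
        t = np.sort(np.random.rand(500) * 3600)  # measurer receive times (s)
        skew_ppm = 37.0                           # hidden truth to recover
        ticks = np.floor((t * (1 + skew_ppm * 1e-6)) * hz)  # TCP timestamp values

        # Offset between the device clock and the measurer clock over time;
        # its slope is the clock skew that fingerprints the physical device.
        offset = ticks / hz - t
        slope = np.polyfit(t, offset, 1)[0]
        print(f"estimated skew: {slope * 1e6:.1f} ppm")  # ~ 37 ppm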